Goto

Collaborating Authors

 perfect equilibrium


Polynomial-time Approximation Scheme for Equilibriums of Games

Sun, Hongbo, Xia, Chongkun, Yuan, Bo, Wang, Xueqian, Liang, Bin

arXiv.org Artificial Intelligence

Nash equilibrium[1] of normal-form game was proposed decades ago, yet even whether PTAS exists for it remains undecided, not to mention for equilibriums of games with dynamics. PTAS for equilibriums of games is important itself in game theory, and the confirmation of its existence may impact multi-agent reinforcement learning research. First, the existence of PTAS relates to the practicality of the amount of computational power in achieving equilibriums of large scale games. It has been proved that exactly computing a Nash equilibrium of a static game is in PPAD-hard class of complexity[2]. Ignoring the possibility that PPAD itself is of polynomial-time[3], PTAS describes methods that approximately compute Nash equilibriums efficiently. Second, the confirmation of previously unknown existence of PTAS for games implies possibility to fundamentally solve the problems of non-stationarity in training and curse of dimensionality[4] in multi-agent reinforcement learning at the same time. Both the two problems are related to the absence of PTAS for equilibriums of games. Non-stationarity in training relates to the fact that existing polynomial-time methods lack convergence guarantee to equilibriums, and curse of dimensionality relates to the fact that methods with convergence guarantee lack polynomial-time complexity.


Observable Perfect Equilibrium

Ganzfried, Sam

arXiv.org Artificial Intelligence

While Nash equilibrium has emerged as the central game-theoretic solution concept, many important games contain several Nash equilibria and we must determine how to select between them in order to create real strategic agents. Several Nash equilibrium refinement concepts have been proposed and studied for sequential imperfect-information games, the most prominent being trembling-hand perfect equilibrium, quasi-perfect equilibrium, and recently one-sided quasi-perfect equilibrium. These concepts are robust to certain arbitrarily small mistakes, and are guaranteed to always exist; however, we argue that neither of these is the correct concept for developing strong agents in sequential games of imperfect information. We define a new equilibrium refinement concept for extensive-form games called observable perfect equilibrium in which the solution is robust over trembles in publicly-observable action probabilities (not necessarily over all action probabilities that may not be observable by opposing players). Observable perfect equilibrium correctly captures the assumption that the opponent is playing as rationally as possible given mistakes that have been observed (while previous solution concepts do not). We prove that observable perfect equilibrium is always guaranteed to exist, and demonstrate that it leads to a different solution than the prior extensive-form refinements in no-limit poker. We expect observable perfect equilibrium to be a useful equilibrium refinement concept for modeling many important imperfect-information games of interest in artificial intelligence.


Robustness and sample complexity of model-based MARL for general-sum Markov games

Subramanian, Jayakumar, Sinha, Amit, Mahajan, Aditya

arXiv.org Artificial Intelligence

Multi-agent reinforcement learning (MARL) is often modeled using the framework of Markov games (also called stochastic games or dynamic games). Most of the existing literature on MARL concentrates on zero-sum Markov games but is not applicable to general-sum Markov games. It is known that the best-response dynamics in general-sum Markov games are not a contraction. Therefore, different equilibria in general-sum Markov games can have different values. Moreover, the Q-function is not sufficient to completely characterize the equilibrium. Given these challenges, model based learning is an attractive approach for MARL in general-sum Markov games. In this paper, we investigate the fundamental question of \emph{sample complexity} for model-based MARL algorithms in general-sum Markov games. We show two results. We first use Hoeffding inequality based bounds to show that $\tilde{\mathcal{O}}( (1-\gamma)^{-4} \alpha^{-2})$ samples per state-action pair are sufficient to obtain a $\alpha$-approximate Markov perfect equilibrium with high probability, where $\gamma$ is the discount factor, and the $\tilde{\mathcal{O}}(\cdot)$ notation hides logarithmic terms. We then use Bernstein inequality based bounds to show that $\tilde{\mathcal{O}}( (1-\gamma)^{-1} \alpha^{-2} )$ samples are sufficient. To obtain these results, we study the robustness of Markov perfect equilibrium to model approximations. We show that the Markov perfect equilibrium of an approximate (or perturbed) game is always an approximate Markov perfect equilibrium of the original game and provide explicit bounds on the approximation error. We illustrate the results via a numerical example.


Sequential Equilibrium in Games of Imperfect Recall

Halpern, Joseph Y. (Cornell University) | Pass, Rafael (Cornell University)

AAAI Conferences

There has been a great deal of interest in AI recently in applying Nevertheless, the intuition that underlies sequential and ideas of game theory to model interacting agents who perfect equilibrium, namely, players should play optimally have possibly different preferences as to the outcome of the even off the equilibrium path, seems to make sense even interaction.


A Crowdfunding Model for Green Energy Investment

Zheng, Ronghuo (Carnegie Mellon University) | Xu, Ying (Carnegie Mellon University) | Chakraborty, Nilanjan (Stony Brook University) | Sycara, Katia (Carnegie Mellon University)

AAAI Conferences

This paper studies a new renewable energy investment model through crowdfunding, which is motivated by emerging community solar farms. In this paper we develop a sequential game theory model to capture the interactions among crowdfunders, the solar farm owner, and an electricity company who purchases renewable energy generated by the solar farm in a multi-period framework. By characterizing a unique subgame-perfect equilibrium, andcomparing it with a benchmark model without crowdfunding, we find that under crowdfunding although the farm owner reduces its investment level, the overall green energy investment level is increased due to the contribution of crowdfunders. We also find that crowdfunding can increase the penetration of green energy in consumption and thus reduce the energy procurement cost of the electricity company. Finally, the numerical results based on real data indicates crowdfunding is a simple but effective way to boost green generation.


A Procedural Characterization of Solution Concepts in Games

Halpern, J. Y., Moses, Y.

Journal of Artificial Intelligence Research

We show how game-theoretic solution concepts such as Nash equilibrium, correlated equilibrium, rationalizability, and sequential equilibrium can be given a uniform definition in terms of a knowledge-based program with counterfactual semantics. In a precise sense, this program can be viewed as providing a procedural characterization of rationality.


Equilibria of Online Scheduling Algorithms

Ashlagi, Itai (Massachusetts Institute of Technology ) | Lucier, Brendan (Microsoft Research New England) | Tennenholtz, Moshe (Microsoft Research, Herzlyia, Israel)

AAAI Conferences

We describe a model for competitive online scheduling algorithms. Two servers, each with a single observable queue, compete for customers. Upon arrival, each customer strategically chooses the queue with minimal expected wait time. Each scheduler wishes to maximize its number of customers, and can strategically select which scheduling algorithm, such as First-Come-First-Served (FCFS), to use for its queue. This induces a game played by the servers and the customers. We consider a non-Bayesian setting, where servers and customers play to maximize worst-case payoffs. We show that there is a unique subgame perfect safety-level equilibrium and we describe the associated scheduling algorithm (which is not FCFS). The uniqueness result holds for both randomized and deterministic algorithms, with a different equilibrium algorithm in each case. When the goal of the servers is to minimize competitive ratio, we prove that it is an equilibrium for each server to apply FCFS: each server obtains the optimal competitive ratio of 2.


Symmetric Subgame Perfect Equilibria in Resource Allocation

Cigler, Ludek (EPFL, Lausanne) | Faltings, Boi (EPFL, Lausanne)

AAAI Conferences

We analyze symmetric protocols to rationally coordinate on an asymmetric, efficient allocation in an infinitely repeated N-agent, C-resource allocation problems. (Bhaskar 2000) proposed one way to achieve this in 2-agent, 1-resource allocation games: Agents start by symmetrically randomizing their actions, and as soon as they each choose different actions, they start to follow a potentially asymmetric "convention" that prescribes their actions from then on. We extend the concept of convention to the general case of infinitely repeated resource allocation games with N agents and C resources. We show that for any convention, there exists a symmetric subgame perfect equilibrium which implements it. We present two conventions: bourgeois, where agents stick to the first allocation; and market, where agents pay for the use of resources, and observe a global coordination signal which allows them to alternate between different allocations. We define price of anonymity of a convention as the ratio between the maximum social payoff of any (asymmetric) strategy profile and the expected social payoff of the convention. We show that while the price of anonymity of the bourgeois convention is infinite, the market convention decreases this price by reducing the conflict between the agents.